1. Introduction


The traditional paradigm for describing and understanding command and control (C2) system (C2S)
operations and performance within the U.S. Army is currently undergoing a radical change. The
U.S. Army Field Manual (FM) 6-0 (Army, 2003) defines the C2S as “the arrangement of personnel,
information management, procedures, and equipment and facilities essential for the commander to
conduct operations.” Tactical battlefield C2 is extremely complicated to orchestrate and conduct
effectively in its own right. In addition, with the introduction of new information systems such
as the Army Battle Command System (ABCS) (Army, 2002), sophisticated new weapons now exist with
unprecedented capabilities for lethality and requirements for battlefield integration. As these
systems contribute to a total reorganization of force structures into the new modularity concept,
the need to understand how the C2S can work effectively as a single system entity grows rapidly.
The complexity of the modern C2S has surpassed the ability for an intuitive understanding of how
individual components or subsystems can improve or degrade the operation of the overall system.
This situation poses the question of how to further develop and improve the performance of the
C2S without making changes that might actually degrade its effectiveness. Some systematic
approach is therefore needed to predict and evaluate the effects that changes, additions, and
improvements in this system will have on its overall ability to conduct battle space management.

1.1  The  Command  and  Control  System’s  Demands  for  Decision  Making 

Previous research on this topic (Middlebrooks, 2003; Middlebrooks et al., 1999a; Wojciechowski,
Plott, & Kilduff, 2005) has developed a paradigm for the systematic evaluation of the C2S from
the system-level viewpoint. A basic premise of this approach is that all observable
characteristics of live tactical operations centers (TOCs) in the field can be used in the
development of quantitative predictive models of various system components for use in a
simulation of the complete C2S. These characteristics include the quantity and quality of
communications messages through the TOC from various digital systems, the quality and timeliness
of intelligence information about the enemy, the numbers and expertise levels of team members
present in the TOC at any given time, individual and group interactions, the physical setting of
the TOC, and environmental conditions, to name a few. However, a striking limitation of this
approach is its inability to simulate and predict the cognitive performance of individuals and
teams in areas such as situation awareness, knowledge elicitation, error generation, individual
versus team performance, and decision making. An ability to predict the optimal decision required
for success can be extremely useful as a performance measure for describing the overall
effectiveness of the system.

1.2 The Model of Optimal Decision Making


This research integrates basic research in decision making being conducted at the Psychology
Department of the University of Texas with applied research in unit of action (UA) TOC operations
being conducted at the Fort Hood Field Element of the U.S. Army Research Laboratory’s (ARL’s)
Human Research and Engineering Directorate. Initial work on this topic (Middlebrooks &
Stankiewicz, 2005) was supported by a grant from the Congressionally funded University XXI
program in a partnership between the faculty and staff of The University of Texas at Austin
through the Institute for Advanced Technology and ARL. The goal of this research is to develop
predictive simulations of C2S UA performance that can be used to evaluate changes in the system
or the addition or modification of system subcomponents. For example, what might be the effect on
the overall ability of the UA TOC to conduct battle space management of adding a new intelligence
system that allows information about the enemy to have a maximum age of 1 hour before it becomes
obsolete versus a maximum age of 4 hours? One intuitive conclusion is that this new capability
would significantly increase the commander’s understanding of the enemy situation because the
information is always more current than before. However, this intense stream of information might
cause the commander to become more focused on the instantaneous situation on the battlefield and
lose situation awareness of longer term developments, with a resulting degradation in the ability
to make effective decisions about how to react to the threat. The effectiveness of predictive
simulations of these types of environments depends on how well they account for the myriad of
variables stemming from physical activities and for the human’s cognitive ability to react to
those variables. The current research is a first step in allowing simulations of system
performance to account for limitations in human cognitive performance abilities.


2.  Method 


The methodology used in the operation of this model involves two components. The first is the
“belief vector” (BV), which represents the current knowledge that the operator has about the
system. The second is the selection of the next action: using this knowledge, the operator
decides what to do next in the pursuit of the mission goal. This model can be generalized to many
different decision-oriented situations where the operator is confronted with a goal-directed
task. In the pursuit of this task, the operator may decide to seek information about the
condition of the current situation, s/he may decide to take some action to achieve the goal, and
at some point, s/he will decide that s/he has achieved the goal, thus terminating the decision
mission sequence.

An example of this scenario is provided in the hospital emergency room where a doctor is faced
with the mission goal of successfully treating a sick patient. The doctor may administer a
medication (take an action) or may perform a medical test (seek information) to attempt to
identify the patient’s condition. A complete series of medications and tests may be performed
before the doctor achieves the belief that the patient has been cured. If the doctor terminates
the sequence before the patient is cured, the patient may die. If the doctor prolongs the
sequence beyond the point when the patient is cured, a substantial unjustified cost is the
result. This medical sequence illustrates gathering information and taking actions until a belief
is achieved that the goal has been reached.

Another example is provided in the military context where a ground force commander is given an
order to seek, find, and destroy an enemy that is at an unknown location. The commander may seek
information (e.g., by flying an unmanned aerial vehicle [UAV] reconnaissance mission) or s/he may
take a direct action to destroy the enemy by firing artillery at a location suspected to be
occupied by the enemy. An entire sequence of UAV and artillery missions may be performed in some
goal-directed order until the commander believes that the enemy has been destroyed. At this
point, the commander decides that the mission is complete and terminates the action.

This process is modeled through the use of Markov decision process analysis to determine the
belief vector and the use of conditional probability logic to determine the next action to take,
based on an evaluation of the current BV.

2.1 Determination of the Belief Vector

It is important to recognize that most decisions are not “one-off” decisions in which the
decision is made and then the rewards reaped or the punishment endured. Instead, most decisions
have future ramifications and affect the options and decisions that are available later. One
challenge faced by any decision maker is uncertainty about the true state of the system. In most
circumstances, the true state of the system is unknown or hidden; that is, it cannot be directly
observed. For example, in military decisions, there is often uncertainty about an enemy’s
position, strength, and morale. Given that the true state is hidden, there are things that can be
done to reduce the decision maker’s uncertainty about these states. For example, the decision
maker may try to determine the enemy’s position by sending reconnaissance to a location where the
enemy is believed to be located. When the reconnaissance returns with either an “enemy sighted”
or “enemy not sighted” report, decision makers must update their belief about the location of the
enemy.

If the observations and actions were all deterministic, revising a belief would be relatively
simple. However, in almost all conditions, the observations and actions are probabilistic. That
is, the probability of getting an observation, given the true state of the environment, is not
necessarily 0.0 or 1.0. In the example given, there is a certain non-zero probability that a
reconnaissance mission sent to the correct location will miss the enemy and send a report of
“enemy not sighted.” Furthermore, there may be a non-zero probability that the reconnaissance
mission falsely sends a report of “enemy sighted” (a false alarm) when the enemy is not actually
at the location.

Given that the observations and actions are probabilistic, revising a belief, given an
observation and an action, can become cognitively difficult. Furthermore, evaluating the added
benefit of a specific piece of equipment that changes these probabilities can also become
difficult. This research focuses on a task that is commonly faced by decision makers in the
military, namely, the seek-and-destroy task. In this task, the decision maker is trying to
localize and destroy an enemy within a specific region. At the decision maker’s disposal are
actions that allow information to be gained about the true state of the system (i.e., the
location of the enemy) and actions that change the state of the system (e.g., moving the enemy
from a specific location to the state of destroyed). The former actions are reconnaissance
actions and the latter are artillery actions. The outcomes of these actions are probabilistic.
That is, reconnaissance actions will not always detect the enemy when a sensor is sent to the
enemy’s location. Furthermore, the reconnaissance may falsely report that the enemy is seen at a
location in which the enemy is not located. In addition, the artillery will not always kill the
enemy when striking it; a kill is characterized as moving the enemy from being alive at a certain
location to the “destroyed” state.

2.1.1  The  Optimal  Observer 

To best evaluate performance in a task that involves uncertainty and probabilistic actions, it is
useful to define the optimal performance within the task. The optimal performance can be
calculated with Bayesian statistics. However, because of the nature of the current type of task,
simple Bayesian statistics are insufficient. That is, with simple Bayesian statistics, the
likelihood of the true state of the system can be optimally estimated. However, this likelihood
does not indicate what action should be selected. In order to select an action, not only must the
current belief state be calculated, given the previous actions and observations, but the optimal
action to be performed in a given belief state must also be calculated, where a belief state is a
particular probability distribution across all the possible states in the environment.

A variation of classical Bayesian statistics that may add predictive power for sequential
decision making under uncertainty is the Partially Observable Markov Decision Process (POMDP)
(Cassandra, 1998; Cassandra, Kaelbling, & Kurien, 1996; Cassandra, Kaelbling, & Littman, 1994;
Kaelbling, Littman, & Cassandra, 1998; Legge, Klitz, & Tjan, 1997). By defining the State Space,
Observation Vector, Transition Matrix, and Reward Structure, we can compute the expected reward
for a particular action. The following sections describe these components and how to optimally
update an individual’s belief (Belief Updating), given these definitions.

An ideal observer model provides optimal performance, given the information available in the
task. Typically, ideal observers are not proposed as models of human cognition. Instead, the
ideal observer provides a benchmark against which to compare human performance. More
specifically, these models illustrate what optimal performance should look like. When human
performance matches that of the ideal observer model, it can be concluded that the human is
effectively processing all of the information in the task. When the human under-performs the
ideal observer, specific discrepancies between the human data and the ideal data may identify the
constraints imposed by the human information-processing system.

Ideal observer analysis is not new to this research and has previously been used to help us
understand perceptual functions ranging from the quantum limits of light detection (Hecht,
Shlaer, & Pirenne, 1942) to many forms of visual pattern detection and discrimination (Geisler,
1989), reading (Legge, Hooven, Klitz, Mansfield, & Tjan, 2002; Legge et al., 1997), object
recognition (Liu, Knill, & Kersten, 1995; Tjan, Braje, Legge, & Kersten, 1995; Tjan & Legge,
1998), eye movements (Najemnik & Geisler, 2005), and reaching tasks (Trommershauser, Gepshtein,
Maloney, Landy, & Banks, 2005).

2.1.2 Defining the State Space¹

In all problems that are solved with a POMDP architecture, there is a set of possible states that
defines the state space. In a POMDP problem, the true state (State_true) is not directly
observable (i.e., it is hidden). For the work in this project, the hidden state is defined as the
enemy’s current position within a two-dimensional grid. This grid of location states is
supplemented by an additional “destroyed” state that the enemy could transition into following an
action to destroy it, such as an artillery strike at its current position. Thus, the number of
states in the state space can be characterized as

(X × Y) + Z, where X ≥ 1, Y ≥ 1, and Z = 1.

In this case, X is the number of locations of the location grid in the X dimension, Y is the
number of locations of the location grid in the Y dimension, and Z is the dead state, which
always contributes exactly one state. With this nomenclature, a 5x5 location grid yields a
26-state space, a 4x4 location grid gives a 17-state space, a 3x3 location grid gives a 10-state
space, a 2x2 location grid gives a 5-state space, and so on. Examples of these state spaces are
illustrated in figures 1 through 3.
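The state-count rule above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `state_count` is an assumption introduced here and is not part of the model described in this report.

```python
def state_count(x: int, y: int) -> int:
    """Number of states for an x-by-y location grid plus one destroyed state."""
    if x < 1 or y < 1:
        raise ValueError("grid dimensions must be at least 1")
    # One state per grid location, plus the single "destroyed" state (Z = 1).
    return x * y + 1


if __name__ == "__main__":
    # Grid sizes mentioned in the text and figures.
    for dims in [(5, 5), (4, 4), (3, 3), (2, 2), (2, 1)]:
        print(dims, "->", state_count(*dims), "states")
```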

2.1.3  Defining  the  Belief  Vector 

Although the true state is hidden, the operator typically employs actions and observations that
provide information about the true state of the problem. In a 26-state space as shown in figure
1, the operator can fire artillery at a specific position or conduct reconnaissance at a
particular location within the environment (i.e., one of the 25 location states). In this model,
a reconnaissance action provides two possible observations: “Enemy Sighted” or “Enemy Not
Sighted”.

An artillery strike is defined to provide only an observation of “No Information,” meaning that
although the artillery strike was conducted, no information is provided about the resulting
condition of the enemy because only a reconnaissance mission can observe the condition of the
enemy. This replicates the fact that the artillery firing unit does not see the effects of its
fires because it is an indirect firing unit and is not able to see where the artillery rounds
fall. It must rely on forward observer (FO) assets to report what is termed “battle damage
assessment” (BDA) in military jargon. An example of a reconnaissance asset is an FO on the ground
or a UAV that provides the BDA.


¹Note that the state space being described for this model presentation represents a very
primitive enemy state space; only one enemy exists in this state space and the enemy can
transition only to the dead state or stay where it is. Clearly, a state space of a likely enemy
in a current real-world battle space will have multiple target types (and numbers) as well as
associated probabilities of transitioning to the dead state. Probabilities will depend on the
type of target as well as the method of engagement.

Figure 1. 5x5 location state space.

Figure 2. 2x2 location state space.

Figure 3. 2x1 location state space.


This model assumes that the observer holds a belief probability between 0% and 100% that the
enemy exists in each of the states within the total state space at any given point in time.
Residing in one of the location states is mutually exclusive with residing in the destroyed
state. That is, if the enemy is “alive” in one of the location states, it cannot be “dead” in the
destroyed state and vice versa. The destroyed state is considered to be an “absorbing state” in
that once the enemy transitions from being alive in a location state to being dead in the
destroyed state, it cannot return to a live location state. The set of the belief probabilities
for all the states in the state space is termed the BV. For the simple three-state space example
in figure 3, the BV could be represented as

\[ \left[ B_{\text{Location 1}},\; B_{\text{Location 2}},\; B_{\text{Destroyed}} \right] \]


For this case, assume that the enemy is alive with an equal probability of residing in either of
the location states. Thus, the BV becomes

[0.5, 0.5, 0.0].
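As a small illustration of this initial BV, the sketch below builds a belief vector that spreads belief equally over the location states and assigns zero belief to the destroyed state. The function name `uniform_belief` and the convention of storing the destroyed state last are assumptions made for this example only.

```python
def uniform_belief(n_locations: int) -> list[float]:
    """Belief vector with equal belief per location and 0.0 for 'destroyed'."""
    if n_locations < 1:
        raise ValueError("at least one location state is required")
    belief = [1.0 / n_locations] * n_locations
    belief.append(0.0)  # destroyed state is stored last (assumed convention)
    return belief


if __name__ == "__main__":
    print(uniform_belief(2))        # the 2x1 grid of figure 3
    print(sum(uniform_belief(25)))  # beliefs over a 5x5 grid should sum to ~1.0
```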


2.1.4  Defining  the  Transition  Matrix 

In this seek-and-destroy problem, the observer (in this case, the military commander) has a
number of possible actions available. In a 5x5 location grid state space, there are 25 possible
reconnaissance actions (one for each of the 25 locations in the environment), 25 possible
artillery actions (again, one for each of the 25 locations within the environment), and the
action to declare “mission complete” when it is believed that the enemy has been destroyed,
making a total of 51 possible actions. The transition matrix defines the probability of the
resulting state, given that the observer executes a particular action in a specified state (i.e.,
p(s'|s,a)), where s' is the resulting state, s is the existing state the observer chooses to act
on (by reconnaissance or artillery), and a is the action taken. In the static form of the
seek-and-destroy problem (i.e., where the enemy is not moving), there is only one state
transition that can occur for any of the possible actions: the transition from a location state
to the “destroyed” state.

When the commander fires artillery at the location where the enemy is, the enemy may be killed
with a certain probability, which will cause it to transition into the “destroyed” state. Sample
probability estimates for use in this discussion are shown in tables 1 through 3. These values
are estimates only and are not to be construed as factual. The probabilities used in actual
analyses depend on the scenario conditions at the time of the actual action sequence and are left
as model input parameters to be employed during the course of simulation studies that use this
model.


Table 1. The set of actions and their observations for the current seek-and-destroy task.

Action   Observation          Condition            Probability
Recon    Enemy Sighted        Enemy Present        0.75
Recon    Enemy Not Sighted    Enemy Present        0.25
Recon    Enemy Sighted        Enemy Not Present    0.2
Recon    Enemy Not Sighted    Enemy Not Present    0.8

The observations for the reconnaissance action depend on whether the enemy is actually within the
viewing region of the reconnaissance. Thus, the two possible states are “enemy present” and
“enemy not present”.
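One way to hold the Table 1 values in code is a simple lookup keyed by condition and observation. The sketch below is illustrative only: the names `RECON_OBSERVATION_MODEL`, `recon_likelihood`, and `observation_vector`, and the observation labels used as dictionary keys, are assumptions, and the numbers are the sample estimates from table 1 rather than factual values.

```python
# Sample estimates copied from table 1: (enemy_present, observation) -> probability.
RECON_OBSERVATION_MODEL = {
    (True, "EnemySighted"): 0.75,
    (True, "EnemyNotSighted"): 0.25,
    (False, "EnemySighted"): 0.2,
    (False, "EnemyNotSighted"): 0.8,
}


def recon_likelihood(enemy_present: bool, observation: str) -> float:
    """Probability of a recon observation, given whether the enemy is present."""
    return RECON_OBSERVATION_MODEL[(enemy_present, observation)]


def observation_vector(n_states: int, recon_state: int, observation: str) -> list[float]:
    """Likelihood of the observation for each candidate true state.

    States are indexed from 0; only the state being reconnoitered counts as
    'enemy present'. Every other state, including the destroyed state, is
    treated as 'enemy not present' at the recon location.
    """
    return [
        recon_likelihood(state == recon_state, observation)
        for state in range(n_states)
    ]


if __name__ == "__main__":
    # Three-state space of figure 3, recon at the first location.
    print(observation_vector(3, 0, "EnemySighted"))
```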


Table 2. Probabilities for observation from artillery strike.

Action   Observation   Condition            Probability
Strike   NoInfo        Enemy Present        1.0
Strike   NoInfo        Enemy Not Present    1.0

Table 3. Probabilities for killing the enemy from artillery strike.

Action   Result              Condition            Probability
Strike   Enemy killed        Enemy Present        0.75
Strike   Enemy not killed    Enemy Present        0.25
Strike   Enemy killed        Enemy Not Present    0.0
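The Table 3 values can be arranged into a transition matrix p(s'|s,a) for a single strike action. The sketch below assumes a static enemy, stores the destroyed state last, and uses the hypothetical function name `strike_transition`; the kill probability of 0.75 is the sample estimate from table 3.

```python
def strike_transition(n_locations: int, target: int, p_kill: float = 0.75) -> list[list[float]]:
    """Transition matrix T[s][s_next] for an artillery strike at `target`.

    Location states are indexed 0..n_locations-1; the destroyed state is the
    last index. An enemy in the targeted location is destroyed with
    probability p_kill; every other state is left unchanged, and the
    destroyed state is absorbing.
    """
    n_states = n_locations + 1
    destroyed = n_locations
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for state in range(n_states):
        if state == target:
            matrix[state][destroyed] = p_kill
            matrix[state][state] = 1.0 - p_kill
        else:
            matrix[state][state] = 1.0
    return matrix


if __name__ == "__main__":
    for row in strike_transition(2, target=0):
        print(row)
```

Each row sums to 1.0 because, for a given starting state, the enemy must end up in exactly one state after the strike.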

2.1.5  Updating  the  Belief  Vector  (BV) 


Given an initial probability distribution over the state space, the observation matrix, and the
transition matrix, hypotheses can be generated about the current state of the problem after an
action and the returned observation. The general form of Bayes’ rule (Trueman, 1977; Walpole,
Myers, & Myers, 1998; Wine, 1964), as shown in equation 1, is used as a basis to develop this
relationship.


\[
P(B_r \mid A) = \frac{P(B_r \cap A)}{\sum_{i=1}^{k} P(B_i \cap A)}
             = \frac{P(B_r)\,P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)},
\qquad \text{for } r = 1, 2, \ldots, k
\]

Equation 1 - General Form of Bayes’ Rule

in which

P   - probability
A   - State A
B   - State B
|   - “so that” or “given”
∈   - probability theory: element of all state spaces
∩   - Boolean AND
∪   - Boolean OR
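Equation 1 translates almost directly into code. The following sketch is a generic illustration, not part of the model itself; the function name `bayes_posterior` and the example numbers are assumptions chosen only to exercise the formula.

```python
def bayes_posterior(priors: list[float], likelihoods: list[float]) -> list[float]:
    """Apply Bayes' rule to k mutually exclusive states.

    priors[i]      = P(B_i)
    likelihoods[i] = P(A | B_i)
    returns        = [P(B_r | A) for r in 1..k]
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i and A)
    total = sum(joint)                                    # P(A)
    if total == 0.0:
        raise ValueError("event A has zero probability under these inputs")
    return [j / total for j in joint]


if __name__ == "__main__":
    # Hypothetical two-state example.
    print(bayes_posterior([0.3, 0.7], [0.9, 0.1]))
```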

2.1.5.1 Bayesian Updating Rule


Using Bayes’ rule, POMDP expressions are derived to simulate sequential decision making under
uncertainty. The updated BV that results from performing a particular action is computed from the
current condition of the state space, a transition matrix for moving from one state to the next,
the prior BV generated by the results of past actions, and an observation vector of information
elements obtained from observations of the state space. This Bayesian updating rule is expressed
as

\[
p(s' \mid b, o, a) = \frac{p(o \mid s', b, a)\; p(s' \mid b, a)}{p(o \mid b, a)}
\]

Equation 2 - Bayesian Updating Rule

in which

s'  - true state (of the condition being present within the total of all states, S),
      represented as s' ∈ S
b   - prior belief vector
o   - observation
a   - action that was generated

In this nomenclature, the term p(s'|b,o,a) is read as

The probability of s' being true, “so that” or “given” the Boolean conditions of
“b” AND “o” AND “a”.

Equation 2 specifies how the ideal observer would update the belief that s' is the true state,
given the prior belief (b), the observation (o), and the action that was generated (a).
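A compact way to read equation 2 is as a two-step computation: first push the prior belief through the transition matrix to obtain p(s'|b,a), then weight by the observation likelihood p(o|s',b,a) and normalize by p(o|b,a). The sketch below is one possible arrangement under that reading; the function name `update_belief` and the matrix layout are assumptions made for illustration, and the numeric inputs are the sample estimates from tables 2 and 3.

```python
def update_belief(
    belief: list[float],
    transition: list[list[float]],
    obs_likelihood: list[float],
) -> list[float]:
    """Update a belief vector after one action and its observation.

    belief[s]          = prior belief that s is the true state (b)
    transition[s][s2]  = p(s2 | s, a) for the action taken
    obs_likelihood[s2] = p(o | s2, a) for the observation received
    """
    n = len(belief)
    # p(s' | b, a): where the enemy could be after the action, before observing.
    predicted = [
        sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Numerator of equation 2 for every s'.
    unnormalized = [obs_likelihood[s2] * predicted[s2] for s2 in range(n)]
    evidence = sum(unnormalized)  # p(o | b, a)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / evidence for u in unnormalized]


if __name__ == "__main__":
    # Strike at the first location of a 2x1 grid, starting from equal belief.
    strike_at_1 = [
        [0.25, 0.0, 0.75],  # enemy in location 1: survives 0.25, destroyed 0.75
        [0.0, 1.0, 0.0],    # enemy in location 2: unaffected
        [0.0, 0.0, 1.0],    # destroyed state is absorbing
    ]
    no_info = [1.0, 1.0, 1.0]  # a strike always returns "NoInfo"
    print(update_belief([0.5, 0.5, 0.0], strike_at_1, no_info))
```

Because the strike observation carries no information, the updated belief in this usage example equals the predicted belief; a reconnaissance action would instead use an identity transition matrix and an informative likelihood vector.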

2.1.5.2 Update the Belief Vector for First Action: Perform Recon_1 at State_1

To illustrate the process of belief updating, the simple 2x1 location state space of figure 3
will be used. Here, the enemy will be associated with one of three states: State_1, State_2, or
State_Dead. For this case, assume that the enemy is alive with an equal probability of residing
in either of the location states. Thus, the BV becomes

[0.5, 0.5, 0.0]

meaning that there is a 50% probability of the enemy being believed to be in State_1, a 50%
probability of the enemy being believed to be in State_2, and a 0% probability of the enemy being
believed to be in State_Dead. Assume that the enemy actually is located in State_1 and that the
observer decides to do a reconnaissance of State_1 and receives an “enemy sighted” observation.

The first task is to determine the observer’s belief, resulting from this action, that the enemy
is located in State_1, State_2, and State_Dead.

With equation 2, the belief likelihood that the enemy is in State_1 is computed. That is, the
desire is to compute the belief probability that the enemy is in State_1, given the current BV,
the current observation, and the current action, or
p(State_1 | [0.5, 0.5, 0.0], “EnemySighted”, Recon_1).

Computing the separate components of equation 2, first compute p(o|s',b,a), or
p(“EnemySighted” | State_1, [0.5, 0.5, 0.0], Recon_1). To do this, the likelihood of obtaining an
observation of “Enemy Sighted” if State_1 were the true state is needed. From table 1, the
likelihood of correctly sighting the enemy is 0.75.

Next, compute p(s'|b,a), or the likelihood of the true state being State_1, given the previous
belief and the action of Recon_1. Because no transition to State_Dead is possible from a recon
mission, the location probabilities remain at their previous values of 0.5.

Finally, compute p(o|b,a), the likelihood of receiving the observation “EnemySighted” when
reconnaissance is made at State_1, or p(“EnemySighted” | [0.5, 0.5, 0.0], Recon_1).

These calculations are

p(o|s',b,a), for State_1 = p(“EnemySighted” | State_1 is true state, [0.5, 0.5, 0.0], Recon_1)

    = Probability of enemy sighted, given that the enemy is at State_1
      and Recon_1 was performed at State_1

    = 0.75,

      from table 1

p(s'|b,a), for State_1 = p(State_1 is true state | [0.5, 0.5, 0.0], Recon_1)

    = Probability of State_1 being the true state, given the prior BV
      and the action Recon_1

    = 0.5,

      from the assumption that the enemy has an equal initial probability
      of being at either of the two location states.

p(o|b,a), for State_1 = p(“EnemySighted” | [0.5, 0.5, 0.0], Recon_1)

    = (Probability of Enemy in State_1 × Probability of Enemy Sighted
      When Present) + (Probability of Enemy in State_2 × Probability of
      Enemy Sighted When Not Present) + (Probability of Enemy Being
      Dead × Probability of Enemy Sighted When Not Present)

    = 0.5 × 0.75 + 0.5 × 0.2 + 0.0 × 0.2 = 0.375 + 0.1 + 0.0 = 0.475

Thus, p(State_1 | [0.5, 0.5, 0.0], “EnemySighted,” Recon_1) =

\[
p(s' \mid b, o, a) = \frac{p(o \mid s', b, a)\; p(s' \mid b, a)}{p(o \mid b, a)}
                   = \frac{0.75 \times 0.5}{0.475} = 0.7895
\]

Likewise, p(State_2 | [0.5, 0.5, 0.0], “EnemySighted,” Recon_1) =

\[
\frac{0.2 \times 0.5}{0.475} = 0.2105
\]

and p(State_Dead | [0.5, 0.5, 0.0], “EnemySighted,” Recon_1) =

\[
\frac{0.2 \times 0.0}{0.475} = 0.0
\]

Thus, if the first action is to observe (i.e., perform a UAV reconnaissance mission) at State_1,
the new belief vector would be

[0.7895, 0.2105, 0.0].

The interpretation of this BV is that the enemy has a 0.7895 probability of being believed to be
at State_1 (present in Cell 1), a 0.2105 probability of being believed to be at State_2 (present
in Cell 2), and a 0.0000 probability of being believed to be at State_3 (dead). Since these
probabilities must account for the total belief state of the operator, they must sum to 1.0.
Performing this checksum,

CS = 0.7895 + 0.2105 + 0.0000 = 1.0000; therefore, checksum verification passed.
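The arithmetic of this first update, including the checksum, can be reproduced with the short sketch below. It mirrors the three components named in the text; the function name `recon_update_example` is hypothetical, and the probabilities are the sample estimates from table 1.

```python
def recon_update_example() -> tuple[float, list[float]]:
    """Reproduce the Recon_1 update for the 2x1 grid of figure 3."""
    prior = [0.5, 0.5, 0.0]        # [State_1, State_2, State_Dead]
    likelihood = [0.75, 0.2, 0.2]  # p("EnemySighted" | s', Recon_1), from table 1
    predicted = prior[:]           # p(s' | b, a): recon does not move the enemy
    # p(o | b, a): total probability of the observation under the prior belief.
    evidence = sum(l * p for l, p in zip(likelihood, predicted))
    # Equation 2 applied to each state.
    posterior = [l * p / evidence for l, p in zip(likelihood, predicted)]
    return evidence, posterior


if __name__ == "__main__":
    evidence, posterior = recon_update_example()
    print(round(evidence, 4))                # 0.475
    print([round(x, 4) for x in posterior])  # [0.7895, 0.2105, 0.0]
    print(round(sum(posterior), 4))          # checksum: 1.0
```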

2.1.5.3 Update the Belief Vector for Second Action: Perform Strike_1 at State_1

Now assume that the second action is to conduct an artillery strike at State_1, which is
represented as Strike_1. In order to update the BV with the belief that the enemy is in State_1
as a result of this new action, determine the probability that the enemy is at State_1, given the
BV from the first action (Recon_1), [0.7895, 0.2105, 0.0], and the new action, Strike_1,
recognizing that the only observation from an artillery strike is that the strike was fired,
which provides the observation “NoInfo”. Thus, the new probability

p(State_1 | [0.7895, 0.2105, 0.0], “NoInfo,” Strike_1)

is computed. Calculating the components for the updated BV component for State_1 from equation 2,

p(o|s',b,a) = p(“NoInfo” | State_1 is true state, [0.7895, 0.2105, 0.0], Strike_1)

    = Probability of “NoInfo”, given the current BV and that Strike_1 was performed
      at State_1

    = 1.0,

      because an artillery strike will always return a report of “NoInfo”,
      simply meaning that the artillery strike was fired with no other
      information provided.

p(s'|b,a) = p(State_1 | [0.7895, 0.2105, 0.0], Strike_1)

    = Probability of “Enemy Not Dead” in State_1 being the true state, given the
      current BV and the action of Strike_1 being fired.

      From table 3, the probability of the enemy being transitioned to dead if
      artillery is fired at the location containing the enemy (in this
      case, Strike_1 into State_1) is 0.75.

      Conversely, from table 3, the probability that Strike_1 into State_1 does not
      kill the enemy, so that the enemy remains alive (i.e., “Enemy Not Dead”) in
      State_1, is 0.25. (The three states in the state space are State_1 [present
      or not present in cell 1], State_2 [present or not present in cell 2], and
      State_3 [belief that the enemy is dead or not dead].)

      Thus, the probability that the enemy’s state will not change, or that the
      enemy will remain alive in State_1, is equal to 0.25 times the belief,
      resulting from the previous Recon_1, that the enemy is in State_1, which is
      0.7895 from p(State_1 | [0.5, 0.5, 0.0], “EnemySighted,” Recon_1).
      Therefore, p(s'|b,a) for State_1 following Strike_1 is
      p(s'|b,a) = 0.7895 × 0.25
                = 0.1974

p(o|b,a) = p(“NoInfo” | [0.7895, 0.2105, 0.0], Strike_1)

    = Probability of “NoInfo”, given current BV and Strike_1 being performed
at  Statei  = 

=  1.0,  because  an  artillery  strike  will  always  return  a  report  of  “Noinfo” 
simply  meaning  that  the  artillery  strike  was  fired  with  no  other 
information  provided. 


Employing equation 2 to determine the BV,

$$p(s' \mid b, o, a) = \frac{p(o \mid s', b, a)\, p(s' \mid b, a)}{p(o \mid b, a)} = \frac{1.0000 \times 0.1974}{1.0} = 0.1974$$

Thus, p(State1 | [0.7895, 0.2105, 0.0], “NoInfo”, Strike1) = 0.1974
Likewise, p(State2 | [0.7895, 0.2105, 0.0], “NoInfo”, Strike1) = 0.2105
Likewise, p(StateDead | [0.7895, 0.2105, 0.0], “NoInfo”, Strike1) = 0.5921


Thus, after the second iteration, where the action was to fire artillery strike 1 into Cell1 (State1), called Strike1,

the BV now becomes [0.1974, 0.2105, 0.5921],

which is interpreted to mean a 0.1974 probability of the enemy being believed to be alive in State1 (Cell1), a 0.2105 probability of the enemy being believed to be alive in State2 (Cell2), and a 0.5921 probability of the enemy being believed to be dead, i.e., in StateDead. Performing the checksum verification,

CS  =  0.1974  +  0.2105  +  0.5921  =  1.0000;  therefore,  Checksum  verification  passed. 

See  appendix  A  for  a  complete  set  of  BV  sample  calculations  for  all  components  of  the  state  space 
for  a  selected  set  of  five  action  sequences. 
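
The strike update can be sketched the same way. This is an illustrative fragment rather than the report's implementation: strike_update and p_kill are hypothetical names, and 0.75 is the kill probability quoted from table 3 in the calculation above.

```python
def strike_update(belief, strike_cell, p_kill=0.75):
    """Transition a [State1, State2, StateDead] belief vector through a strike."""
    dead = len(belief) - 1
    updated = list(belief)
    # Belief mass in the struck cell moves to the dead state with probability p_kill.
    killed = p_kill * belief[strike_cell]
    updated[strike_cell] = belief[strike_cell] - killed
    updated[dead] = belief[dead] + killed
    # The only report is "NoInfo", with probability 1.0 in every state, so the
    # Bayes correction in equation 2 divides by 1.0 and changes nothing.
    return updated


after_strike = strike_update([0.7895, 0.2105, 0.0], strike_cell=0)
print([round(p, 4) for p in after_strike])  # [0.1974, 0.2105, 0.5921]
```

Because the observation carries no information, the strike step is purely a state transition: belief in the cell not fired upon is unchanged.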

2.2  Determination  of  the  Action  Sequence 

The action to take at each iteration of the model is chosen to produce the statistically optimal end state. This state is defined as the end state reached by the fewest actions with the highest reward value possible. Reward structure is discussed later in this report. The action sequence is selected through a deterministic evaluation of the previous Bayesian state space and BV with the use of conditional probability logic. To begin this discussion, the following definitions are made:

Δ = (Delta) Declare Threshold; if the Belief Probability of Enemy Destroyed > Δ, then Declare; Δ only refers to the DEAD state.

σ = (Sigma) Shoot Threshold; if the Conditional Probability of the enemy being in a location state > σ, then Shoot at that location; otherwise, perform Recon; σ only refers to the LOCATION states.

Contrast = The probability of the enemy being in one state relative to all other (location) states (the dead state is therefore excluded from the Contrast determination). This is a calculated value referred to as a Conditional Probability (CP).

Thus, if the belief probability of the enemy being destroyed > Δ, then the action decision will be to declare the mission complete and end it. However, if this belief probability does not exceed Δ (for example, Δ = 0.9), then the Recon versus Shoot logic is applied according to the σ threshold. Note that the Δ threshold only applies to State3, the dead state, and the σ threshold only applies to the location states, State1 and State2. Therefore, the Δ and σ thresholds are independent and not directly related. For the purposes of discussion, the following assignments are made. Like the previous assignments in tables 1 through 3, these are for reference only; they are left as input parameters to the model to set its level of tolerance and to reflect the scenario conditions in effect at the time of the model invocation.

Assume the following assignments:

Δ = 0.9.

σ = 0.75.

2.2.1  Conditional  Probability  Logic 

For the case under investigation, the action decision logic can be viewed as a CP. The probability of the enemy being in one of the two location states (State1 or State2), given that the enemy is not destroyed, i.e., not present in State3, can be represented as

P(S1 | Enemy Not Destroyed) = P(S1 | !S3), where the ! symbol represents Boolean ‘NOT’.

Employing the form of Bayes’ Theorem (equation 3), this becomes

$$P(S_1 \mid\, !S_3) = \frac{P(!S_3 \mid S_1)\, P(S_1)}{P(!S_3)}$$

Equation 3 - Conditional Probability Initial Form

Since the enemy is alive whenever it is in State1, the probability of the enemy not being dead given State1 is equal to 1.0, represented as

P(!S3 | S1) = 1.0,

and the probability of the enemy not being dead, P(!S3), is equal to the sum of the probabilities of being in one of the location states, or in this case [P(S1) + P(S2)], equation 3 now becomes

$$P(S_1 \mid\, !S_3) = \frac{1.0 \times P(S_1)}{P(S_1) + P(S_2)} = \frac{P(S_1)}{P(S_1) + P(S_2)}$$

Equation 4 - Conditional Probability Expression

2.2.2 Calculation of Sample Action Sequence Using Conditional Probability

By evaluating the CP using the σ and Δ thresholds, the model makes the action decisions during each iteration of the logic. After each action, a new BV is calculated to be used in the next action decision. For the sample σ and Δ threshold values of 0.75 and 0.90, the action sequence for the first five actions is computed for verification of the computer simulation runs that will be made with this model. These parameter choices yield the following sequence for the first five actions:

1) Recon1 to S1.

2) Strike1 to S1.

3) Recon2 to S2.

4) Strike2 to S1.

5) Recon3 to S2.

The  BV  and  action  values  manually  calculated  for  this  five-action  sequence  will  be  used  as  test 
parameters  to  develop  a  computer  simulation  modeling  this  process. 
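
As a hand check of the first two decisions in this sequence, the CP expression can be applied to the BVs computed in section 2.1.5 with the sample thresholds σ = 0.75 and Δ = 0.90. This is an illustrative calculation from values already given in the text, not a reproduction of the appendix tables.

```latex
\begin{align*}
\text{After Recon}_1:\quad
  & P(S_1 \mid\, !S_3) = \frac{0.7895}{0.7895 + 0.2105} = 0.7895 > \sigma
  && \Rightarrow\ \text{Strike at } S_1 \\
\text{After Strike}_1:\quad
  & P(S_1 \mid\, !S_3) = \frac{0.1974}{0.1974 + 0.2105} = 0.4839 < \sigma \\
  & P(S_2 \mid\, !S_3) = \frac{0.2105}{0.1974 + 0.2105} = 0.5161 < \sigma
  && \Rightarrow\ \text{Recon at } S_2
\end{align*}
```

After Strike1 the dead-state component, 0.5921, is still below Δ, so the mission is not yet declared complete, and the reconnaissance goes to the location with the larger CP.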

See appendix B for tables that illustrate the action calculations that determine this sequence.

2.3  Implementation  in  C3TRACE 

To implement this model in a computer simulation, the programming environment of command, control, and communications: techniques for the reliable assessment of concept execution (C3TRACE) (Kilduff, Swoboda, & Barnette, 2005; Plott, 2002; Plott, Quesada, Kilduff, Swoboda, & Allender, 2004) is employed. C3TRACE, developed through funding by ARL’s Human Research and Engineering Directorate, is an adaptation of the commercial discrete event programming language Micro Saint Sharp (Schunk & Plott, 2004). Although the basic Micro Saint Sharp programming language allows task-based computer simulations of real-world systems and processes to be represented, C3TRACE has embedded data structures that augment Micro Saint Sharp to allow for detailed representation of U.S. Army C2 systems.

The  optimal  decision-making  model  described  in  this  report  allows  existing  computer  simulations 
of  C2  systems  configured  around  task  performance  analysis  (Cassandra  et  al.,  1996;  Hancock  & 
Meshkati,  1988;  Middlebrooks,  2003,  2004;  Middlebrooks  et  al.,  1999a,  1999b;  Middlebrooks  & 
Williges,  2002)  to  now  be  structured  to  incorporate  optimal  decision  making  as  a  performance 
metric  with  the  use  of  the  belief  updating  model.  The  steps  in  this  process  resemble  the  well-known 
observe-orient-decide-act (OODA) model (Belknap, 1996; Boyd, 1982; Morgan, Glickman, Woodward, Blaiwes, & Salas, 1986). The decision actions in this model consist of gathering information, updating the belief about the environment or state space, taking an action to accomplish an objective in the state space, and then making a decision whether to continue the mission or terminate it with an assessment of mission success or failure. An example in a military C2 scenario employs a UAV to gather the intelligence, artillery to take an action to destroy an enemy somewhere within the state space, and belief updating to evaluate the situation after each action and then either repeat the sequence or declare “mission complete”.

2.3.1  Design  of  the  C3TRACE  Simulation 

C3TRACE programs are implemented with discrete event language constructs common to any Micro Saint Sharp simulation program. The top level of a C2 sub-workgroup within a sample organization is shown in the example depicted in figure 4. Here, messages received by the radio operator are distributed according to their subject content. Situation reports (SITREPs) are passed to the S3 operations officer, logistics reports are passed to the S4 logistics officer for action, and so on. If, for example, a mission directive such as seek and destroy an enemy is received, it is passed to the commander for action. Several different reactions might be invoked in response to such a directive. The commander might communicate with the originating authority to clarify information, perform an initial estimate of the situation before taking action, update the situational awareness before taking action, or undertake the mission as directed. In this last case, as depicted in the green box in figure 4, what is referred to as the decision making during uncertainty (DMDUC) process would be initiated to execute the mission.


Figure  4.  C3TRACE  command  and  control  simulation  vignette. 

Figure 5 illustrates the optimal decision process that is modeled. As stated, this process is very similar to the OODA model. The diagram represents an iterative process in which the decision maker makes an initial estimate of the situation and then repeatedly either gathers additional information (flying a UAV mission) or takes an action to destroy the enemy (firing artillery). When the commander believes that the enemy has been destroyed, a mission complete decision is made and the results of the decision are realized. If the enemy was destroyed and the decision maker made that correct assessment, then a positive reward resulting from a good decision is applied to the performance of the overall system. If the enemy was not destroyed and the decision maker believed that it was destroyed, then a negative battlefield outcome is applied to the simulation. Likewise, if the enemy was destroyed but the decision maker believed it was not, then the results of poor decision making are applied. This process of iterative action can be generalized to similar scenarios where information is gathered (observe), belief updating occurs (orient), decisions are made for mission success (decide), and actions are taken to accomplish the mission (act). The examples of employing a UAV and firing artillery are used here to provide a tangible example of how this type of activity might occur.

Referring to figure 5, the top-level logic for this model can be examined. After initiating the decision sequence and performing an initial estimate of the situation, the commander updates the BV, defined as the belief about the current situation regarding the enemy, and then begins an iterative process of looking for information or taking an action to accomplish the mission. When this process has reached some level of belief that the mission is accomplished, the commander terminates the action and completes the decision process by declaring that the mission is a “success” or a “failure”.

If the initial desire is to obtain additional information, a UAV is sent to a specified location to try to locate the enemy. The UAV is the information-gathering or BDA tool available to the commander to update the BV about the enemy. If the target is already dead from previous artillery action, then there is no correct location for the enemy because it does not exist. If the enemy is alive and the UAV is sent to the correct location, then it has a probability, according to table 1, of detecting or not detecting the enemy according to the accuracy of the UAV. From this it will correctly or incorrectly report that the enemy was found. Likewise, if it is sent to a location where the enemy is not located or if the enemy is already dead, it may correctly or incorrectly report the enemy sighted, again according to table 1. The values in table 1 are only sample estimates for use in the development of this model and do not represent any actual system currently in existence. During the actual use of this model, these parameters are set to represent the actual detection characteristics of the information-gathering entity being evaluated. After the UAV mission is flown, the commander evaluates the report from the UAV through the process of updating the BV and, using this new information, decides what process to invoke next.

Figure 5. Optimal decision making during uncertainty model.


If the commander decides to fire artillery (which is representative of taking a positive action to accomplish the mission), then the probability exists that the right or wrong location will be fired upon. If the artillery fires on the wrong location, then the only outcome will be to miss the target. If the correct location is fired upon, then the artillery will kill or not kill the enemy according to the circular area of probability for the type of artillery fired. Independent of where the artillery is fired, the only report that is sent to the commander is that the artillery fired upon the location specified, i.e., “no information” concerning BDA. This represents the fact that artillery is an indirect fire weapon and the firing unit never actually sees the target. The forward observer, or in this case, the UAV, must report the actual target situation, i.e., provide the BDA. The commander must then evaluate the firing data and information from previous UAV reconnaissance missions to decide whether to continue the mission or declare the enemy is dead and end the mission.

When the commander believes that the enemy has been destroyed, mission complete is declared. The commander is then faced with either the rewards of a successful, i.e., good, decision sequence where the enemy was killed, meaning that the mission was accomplished, or the effects of a bad decision where the enemy was not killed, meaning that the mission was not accomplished.

2.3.2  Belief  Updating  Logic 


Figure 6 illustrates the input feeding the sequence of evaluating the current situation and updating the BV and the resulting choice for the next action.

Figure 6. Task diagram for belief updating.


The  code  script  in  the  beginning  effect  of  C3TRACE  task  “Update  Belief  Vector”  of  figure  6 
closely  follows  the  BV  updating  logic  described  before.  An  annotated  description  of  this  logic  as 
implemented  in  the  C3TRACE  computer  simulation  takes  the  form 


Definitions:

BV - Belief Vector

Variables in Equation 2:

s’ - True state within the total state space,

b - Prior belief for that state,

o - Current observation,

a - Action that was generated.

In-State - Probability of a state being transferred into. This is equal to 0 for location states because, for a static enemy condition, a location state can never be transferred into.

Out-State - Probability of a state being transferred out from. This is equal to 0 for the dead state because once the enemy is dead it cannot be transferred back to alive.

Components of Equation 2:

PoGs’ba - Probability of o Given s’ & b & a.

Ps’Gba - Probability of s’ Given b & a.

PoGba - Probability of o Given b & a.

Ps’Gboa - Probability of s’ Given b & o & a = (PoGs’ba * Ps’Gba) / PoGba => Eq. 2.

If the action generated was to recon, then update the BV from the recon mission:

    Do for each state:
        Calculate PoGs’ba from Table 1 lookup.
        Calculate Ps’Gba from previous belief probability of the enemy in the state occupied.
        Calculate PoGba from:
            (Probability of Enemy in State Action To * Table 1 Lookup for that State) +
            (Probability of Enemy in State Action Not To * Table 1 Lookup for that State) +
            (Probability of Enemy in State Dead * Table 1 Lookup for condition applicable to State Dead)
        Calculate Ps’Gboa from: (PoGs’ba * Ps’Gba) / PoGba
        Equate the BV component for that state = Ps’Gboa.
    End Do

Else if the action generated was to Shoot, then update BV from the shoot mission:

    Do for each state:
        Calculate PoGs’ba from Table 2 lookup which will always = 1.0.
        Calculate Ps’Gba from:
            [Probability (In-State) x (previous probability for State Struck)] +
            [(1.0 - Probability (Out-State)) x (previous probability for State Occupied)]
        Calculate PoGba from Table 2 lookup which will always = 1.0.
        Calculate Ps’Gboa from: (PoGs’ba * Ps’Gba) / PoGba
        Equate the BV component for that state = Ps’Gboa.
    End Do

End

See  figures  12  through  16  for  sample  calculations  performed  to  test  this  logic. 

2.3.3  Action  Decision-Making  Logic 

The code script in the beginning effect of C3TRACE task “Evaluate Report” of figure 6 closely follows the CP logic described previously. An annotated description of this logic as implemented in the C3TRACE computer simulation takes the form

Definitions:

CP - Conditional Probability

Sigma - σ, Shoot Threshold; if the CP of a location state > σ, then Shoot at that location, otherwise recon. σ only refers to the LOCATION states.

Delta - Δ, Declare Threshold; if the Belief Probability of Enemy Destroyed > Δ, then Declare. Δ only refers to the DEAD state.

Variables in Equation 6:

PS1 - Previous BV component for S1.

PS2 - Previous BV component for S2.

PS1GNS3 - Probability of S1 given NOT S3 (the enemy not being dead) = PS1 / (PS1 + PS2) => Eq. 6.

Do
    Calculate CP1 = PS1 / (PS1 + PS2)
    Calculate CP2 = PS2 / (PS1 + PS2)
End Do

If ((CP1 < σ) & (CP2 < σ)) then
    If (CP1 > CP2) then Recon at S1
    If (CP1 < CP2) then Recon at S2
    If (CP1 = CP2) then Recon at Random pick of S1 or S2
End If

If (CP1 > σ) then SHOOT at S1
If (CP2 > σ) then SHOOT at S2
If ((CP1 > σ) & (CP2 > σ)) then SHOOT at Random pick of S1 or S2
Calculate new BV
If ((BV component for S3) > Δ) then
    Declare mission complete
Else continue processing and go to next iteration

See  tables  8  through  12  for  sample  calculations  performed  to  test  this  logic. 
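
The decision logic above might be written in Python roughly as follows. This is a sketch under stated assumptions, not the C3TRACE script: the names choose_action and should_declare are hypothetical, cell indices 0 and 1 stand for S1 and S2, and a CP exactly equal to σ is treated as not exceeding it, a case the description leaves open.

```python
import random


def choose_action(belief, sigma=0.75, rng=random):
    """Return ("recon" or "shoot", cell index) for a [S1, S2, S3] belief vector."""
    p_s1, p_s2 = belief[0], belief[1]
    alive = p_s1 + p_s2
    if alive == 0.0:
        raise ValueError("no belief mass left in the location states")
    cp = [p_s1 / alive, p_s2 / alive]  # CP1 and CP2
    above = [cell for cell in (0, 1) if cp[cell] > sigma]
    if above:
        # One CP (or, for sigma below 0.5, possibly both) exceeds the threshold.
        return "shoot", rng.choice(above)
    # Otherwise recon the more likely cell; ties are broken at random.
    best = max(cp)
    return "recon", rng.choice([cell for cell in (0, 1) if cp[cell] == best])


def should_declare(belief, delta=0.90):
    """Declare mission complete once the dead-state belief exceeds delta."""
    return belief[2] > delta


print(choose_action([0.7895, 0.2105, 0.0]))     # CP1 = 0.7895 exceeds 0.75
print(choose_action([0.1974, 0.2105, 0.5921]))  # CP1 = 0.4839, CP2 = 0.5161
print(should_declare([0.0609, 0.0254, 0.9137]))
```

The dead-state component is excluded from the CP calculation and consulted only by the declare test, which mirrors the independence of the σ and Δ thresholds.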


3.  Discussion  and  Results 


The  simplest  version  of  the  POMDP  model  state  space  design  as  shown  in  figure  3  is  used  in  this 
report  to  evaluate  and  demonstrate  the  logic  through  the  computer  simulation  in  C3TRACE. 
Although the state space that consists of two location states, State1 and State2, and one status state, StateDead, can seem trivial and unrelated to any actual human performance condition, even this
simple  arrangement  can  relate  to  actual  performance.  The  seek-and-destroy  mission,  looking  to 
destroy  an  enemy  residing  at  some  unknown  location,  can  be  characterized  as  looking  or  shooting 
at  the  enemy  at  the  right  or  wrong  location  before  declaring  that  the  enemy  has  been  destroyed. 
Thus,  a  simple  form  of  the  state  space  such  as  this  can  form  the  basis  for  developing  logic  that  can 
be  expanded  after  verification  to  much  larger  location  state  spaces. 

A  means  for  evaluating  the  performance  of  the  simulation  is  to  implement  a  reward  structure  (RS) 
consisting of an explicit cost for taking different actions. There would be a certain cost for conducting a reconnaissance and another greater cost for conducting an artillery strike. There would
also  be  a  reward  if  the  mission  complete  declaration  is  made  when  the  enemy  has  actually  been 
destroyed  and  a  corresponding  large  cost  assessed  when  mission  complete  is  called  when  the 
enemy  has  not  been  destroyed. 
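
The action-cost part of such an RS can be illustrated with a short Python fragment. The function name sequence_cost and the single-letter action codes are hypothetical; the costs of 10 combat power points per recon and 100 per strike are the sample values quoted in section 3.2.1. The reward and penalty for the mission complete declaration are not given numerically in the text and are therefore left out.

```python
RECON_COST = 10    # combat power points per reconnaissance (sample value)
SHOOT_COST = 100   # combat power points per artillery strike (sample value)


def sequence_cost(actions):
    """Sum the action costs for a sequence written with 'R' and 'S' codes."""
    costs = {"R": RECON_COST, "S": SHOOT_COST}
    return sum(costs[a] for a in actions)


# Five-action sequence recon-shoot-recon-shoot-recon.
print(sequence_cost("RSRSR"))  # 230
```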

3.1  Control  Parameters 

The Δ and σ control parameters for the BV and CP calculations allow the model to respond to settings for aggressiveness by the operator in making decisions to perform reconnaissance or to shoot and to reflect the operator’s confidence when a successful mission has occurred. The σ control parameter is used to set the reconnaissance versus shoot threshold criteria for the performance of the simulation. Values of σ during analytical runs of the simulations can be varied to represent the complexity and decision threshold conditions of the scenario being simulated. The Δ control parameter is used to set the decision threshold criteria for the performance of the simulation. Values of Δ during analytical runs of the simulations can be varied to represent the complexity and decision threshold conditions of the scenario being simulated.

3.1.1  Sigma  Parameter 

The σ control parameter for the BV and CP calculations is used to set the look versus shoot threshold criteria for the performance of the simulation. Values of σ during analytical runs of the simulations can be varied to represent the complexity and decision threshold conditions of the scenario being simulated.

3.1.2  Delta  Parameter 

The Δ control parameter is used to set the decision threshold criteria for the performance of the simulation. Values of Δ during analytical runs of the simulations can be varied to represent the complexity and decision threshold conditions of the scenario being simulated.

3.2  Action  Sequence  Assessment 

To examine the CP logic associated with actions in this state space, an example of actions and the resulting belief vectors is examined. The assumptions are that the enemy is located in State1 and that it is static, i.e., not moving. Initially, the commander believes with equal probability that the enemy could be in either of the location states and believes that the enemy is alive. The initial belief vector is thus [0.5, 0.5, 0.0], meaning a 50% chance of being in location State1, a 50% chance of being in location State2, and a 0.0% chance of being in StateDead, i.e., the enemy is alive.

3.2.1 Simulation Action Sequence With Δ = 0.90 and σ = 0.75

Assume that the control parameter values are initially set to

σ = 0.75, i.e., if the CP of the enemy being in a location state > σ, then shoot at that location, otherwise recon.

Δ = 0.90, i.e., if the Belief Probability of Enemy Destroyed (in regard to the dead state) > Δ, then declare mission complete.

These  assumed  values  are  for  example  only  and  are  not  to  be  construed  to  represent  any  actual 
system. 
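
One way to exercise these settings is a small closed-loop sketch in Python that alternates the decision and belief-update steps. It is illustrative only and rests on several assumptions that the text does not state: each recon is assumed to return the report most likely for a live enemy in State1 ("sighted" when the UAV looks at State1, "not sighted" when it looks at State2), ties between the two CPs go to S1, and a CP equal to σ counts as meeting the threshold. The last assumption matters because, in exact arithmetic, the CP for S1 at the fourth decision equals 3/4 exactly; exact fractions are used to keep that comparison free of rounding. All function names are hypothetical.

```python
from fractions import Fraction as F

P_DETECT, P_FALSE, P_KILL = F(3, 4), F(1, 5), F(3, 4)  # 0.75, 0.2, 0.75
TRUE_CELL = 0  # enemy assumed static in State1 (index 0)


def recon(bv, cell):
    """Bayes update for a UAV look at `cell` (equation 2)."""
    # Assumption: the report is "sighted" only when the true cell is observed.
    sighted = cell == TRUE_CELL
    like = []
    for state in range(3):
        p_sighted = P_DETECT if state == cell else P_FALSE
        like.append(p_sighted if sighted else 1 - p_sighted)
    total = sum(lk * b for lk, b in zip(like, bv))
    return [lk * b / total for lk, b in zip(like, bv)]


def shoot(bv, cell):
    """Move P_KILL of the struck cell's belief into the dead state."""
    out = list(bv)
    out[cell] -= P_KILL * bv[cell]
    out[2] += P_KILL * bv[cell]
    return out


def choose(bv, sigma):
    """Pick ('R' or 'S', cell) from the conditional probabilities."""
    alive = bv[0] + bv[1]
    cp = [bv[0] / alive, bv[1] / alive]
    cell = 0 if cp[0] >= cp[1] else 1                 # assumption: ties go to S1
    return ("S" if cp[cell] >= sigma else "R"), cell  # assumption: >= sigma


def run(sigma, delta=F(9, 10), max_actions=20):
    """Act until the dead-state belief exceeds delta or max_actions is reached."""
    bv, log = [F(1, 2), F(1, 2), F(0)], []
    while len(log) < max_actions and bv[2] <= delta:
        kind, cell = choose(bv, sigma)
        bv = recon(bv, cell) if kind == "R" else shoot(bv, cell)
        log.append(f"{kind}{cell + 1}")
    return log, bv


log, bv = run(sigma=F(3, 4))
print(log, round(float(bv[2]), 4))
```

Under these assumptions the loop stops after the actions R1, S1, R2, S1, R2 with a dead-state belief of about 0.9137.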

Applying these parameters to the BV and CP logic generates the sequence of actions as shown in figure 7 and table 4. Even though the BV component for StateDead exceeds Δ at iteration 5, the model run was continued for 20 iterations to illustrate the action sequence asymptotic relationships.

Activating the Δ control parameter causes the simulation to declare mission complete after iteration 5 shown in table 9 with a StateDead BV component = 0.9137, which is just over the Δ threshold of 0.90. This results in a five-action sequence of recon-shoot-recon-shoot-recon to declare. If an RS is implemented with a cost of 10 combat power points to recon and 100 combat power points to shoot, then the cost of this action sequence would be (3 x 10) + (2 x 100) = 230.


[Figure 7 plot. Title: “Sequence (w/Enemy @ S1): Conditional Probability Action Decisions, Δ = 0.9; σ = 0.75.” Vertical axis: Belief Probability. Horizontal axis: Action. Series: State_Space [1,1], State_Space [2,1], State_Space Dead.]

Figure 7. First 20 action decisions for Δ = 0.90 and σ = 0.75.

Table 4. First 20 action decisions for Δ = 0.90 and σ = 0.75.

Action             State_Space [1,1]  State_Space [2,1]  State_Space Dead
0- Initial         0.5000             0.5000             0.0000
1- Recon to S1     0.7895             0.2105             0.0000
2- Shoot to S1     0.1974             0.2105             0.5921
3- Recon to S2     0.2308             0.0769             0.6923
4- Shoot to S1     0.0577             0.0769             0.8654
5- Recon to S2     0.0609             0.0254             0.9137
Declare Threshold
6- Recon to S1     0.1957             0.0217             0.7826
7- Shoot to S1     0.0489             0.0217             0.9293
8- Recon to S1     0.1617             0.0192             0.8192
9- Shoot to S1     0.0404             0.0192             0.9404
10- Recon to S1    0.1364             0.0172             0.8463
11- Shoot to S1    0.0341             0.0172             0.9487
12- Recon to S1    0.1169             0.0158             0.8673
13- Shoot to S1    0.0292             0.0158             0.9550
14- Recon to S1    0.1015             0.0146             0.8840
15- Shoot to S1    0.0254             0.0146             0.9600
16- Recon to S1    0.0889             0.0136             0.8974
17- Shoot to S1    0.0222             0.0136             0.9641
18- Recon to S1    0.0786             0.0129             0.9086
19- Shoot to S1    0.0196             0.0129             0.9675
20- Recon to S1    0.0699             0.0122             0.9179

Table 5. First 20 action decisions for Δ = 0.90 and σ = 0.75.

Action             State_Space [1,1]  State_Space [2,1]  State_Space Dead
0- Initial         0.5000             0.5000             0.0000
1- Recon to S1     0.7895             0.2105             0.0000
2- Shoot to S1     0.1974             0.2105             0.5921
3- Recon to S2     0.2308             0.0769             0.6923
4- Shoot to S1     0.0577             0.0769             0.8654
5- Recon to S2     0.0609             0.0254             0.9137
Declare Threshold
6- Recon to S1     0.1957             0.0217             0.7826
7- Shoot to S1     0.0489             0.0217             0.9293
8- Recon to S1     0.1617             0.0192             0.8192
9- Shoot to S1     0.0404             0.0192             0.9404
10- Recon to S1    0.1364             0.0172             0.8463
11- Shoot to S1    0.0341             0.0172             0.9487
12- Recon to S1    0.1169             0.0158             0.8673
13- Shoot to S1    0.0292             0.0158             0.9550
14- Recon to S1    0.1015             0.0146             0.8840
15- Shoot to S1    0.0254             0.0146             0.9600
16- Recon to S1    0.0889             0.0136             0.8974
17- Shoot to S1    0.0222             0.0136             0.9641
18- Recon to S1    0.0786             0.0129             0.9086
19- Shoot to S1    0.0196             0.0129             0.9675
20- Recon to S1    0.0699             0.0122             0.9179

3.2.2 Simulation Action Sequence With Δ = 0.90 and σ = 0.89

Applying these parameters to the BV and CP logic generates the sequence of actions as shown in figure 8 and table 6. Even though the BV component for StateDead exceeds Δ at iteration 9, the model run was again continued for 20 iterations to illustrate the action sequence asymptotic relationships.

[Figure 8 plot. Title: “Sequence (w/Enemy @ S1): Conditional Probability Action Decisions, Δ = 0.9; σ = 0.89.” Horizontal axis: Action. Series: State_Space [1,1], State_Space [2,1], State_Space Dead.]

Figure 8. First 20 action decisions for Δ = 0.90 and σ = 0.89.


Table 6.  First 20 action decisions for Δ = 0.90 and σ = 0.89.

Action              State_Space    State_Space    State_Space
                    [1,1]          [2,1]          Dead
0-  Initial         0.5000         0.5000         0.0000
1-  Recon to S1     0.7895         0.2105         0.0000
2-  Recon to S1     0.9336         0.0664         0.0000
3-  Shoot to S1     0.2334         0.0664         0.7002
4-  Recon to S1     0.5331         0.0404         0.4265
5-  Shoot to S1     0.1333         0.0404         0.8263
6-  Recon to S1     0.3657         0.0296         0.6047
7-  Shoot to S1     0.0914         0.0296         0.8790
8-  Recon to S1     0.2740         0.0236         0.7024
9-  Shoot to S1     0.0685         0.0236         0.9079
10- Recon to S1     0.2161         0.0199         0.7640
11- Shoot to S1     0.0540         0.0199         0.9261
12- Recon to S1     0.1764         0.0173         0.8063
13- Shoot to S1     0.0441         0.0173         0.9386
14- Recon to S1     0.1475         0.0154         0.8370
15- Shoot to S1     0.0369         0.0154         0.9477
16- Recon to S1     0.1256         0.0140         0.8604
17- Shoot to S1     0.0314         0.0140         0.9546
18- Recon to S1     0.1084         0.0129         0.8787
19- Shoot to S1     0.0271         0.0129         0.9600
20- Recon to S1     0.0945         0.0120         0.8934

For this action sequence, the Δ control parameter causes the simulation to declare mission complete
after iteration 9, shown in table 6, with a State_Dead BV component = 0.9079. This results in a
nine-action sequence of R-R-S-R-S-R-S-R-S to declare. If we evaluate this sequence with the RS at a
cost of 10 combat power points to recon and 100 combat power points to shoot, the cost of the
action sequence is (5 x 10) + (4 x 100) = 450.
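
The reward-structure arithmetic above can be checked with a minimal Python sketch. The function
name `sequence_cost` and the string encoding of the sequence (R for recon, S for shoot) are
illustrative assumptions and are not part of the C3TRACE model; only the unit costs of 10 and 100
combat power points come from the text.

```python
# Illustrative sketch: evaluate the reward structure (RS) cost of an action sequence.
# The "R"/"S" string encoding and the function name are assumptions for illustration.
RECON_COST = 10    # combat power points per recon (from the text)
SHOOT_COST = 100   # combat power points per shoot (from the text)


def sequence_cost(sequence: str, recon_cost: int = RECON_COST,
                  shoot_cost: int = SHOOT_COST) -> int:
    """Return the total cost of a sequence such as 'R-R-S-R-S-R-S-R-S'."""
    # Keep only the action letters so that separators such as '-' are ignored.
    actions = [a for a in sequence.upper() if a in "RS"]
    return actions.count("R") * recon_cost + actions.count("S") * shoot_cost


# Five recons and four shoots, as in the nine-action sequence of this section.
print(sequence_cost("R-R-S-R-S-R-S-R-S"))
```

Keeping the unit costs as parameters makes it easy to see how a different recon-to-shoot cost
ratio would change the ranking of candidate sequences.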

3.2.3  Simulation Action Sequence With Δ = 0.90 and σ = 0.55

Applying these parameter values to the BV and CP logic generates the sequence of actions
shown in figure 9 and table 7. Even though the BV component for State_Dead exceeds Δ at iteration
5, the model run was also continued for 20 iterations to illustrate the asymptotic behavior of the
action sequence.


Figure 9.  First 20 action decisions for Δ = 0.90 and σ = 0.55.

Table 7.  First 20 action decisions for Δ = 0.90 and σ = 0.55.

Action              State_Space    State_Space    State_Space
                    [1,1]          [2,1]          Dead
0-  Initial         0.5000         0.5000         0.0000
1-  Recon to S1     0.7895         0.2105         0.0000
2-  Shoot to S1     0.1974         0.2105         0.5921
3-  Recon to S2     0.2308         0.0769         0.6923
4-  Shoot to S1     0.0577         0.0769         0.8654
5-  Shoot to S2     0.0577         0.0192         0.9231
6-  Shoot to S1     0.0144         0.0192         0.9663
7-  Shoot to S2     0.0144         0.0048         0.9808
8-  Shoot to S1     0.0036         0.0048         0.9916
9-  Shoot to S2     0.0036         0.0012         0.9952
10- Shoot to S1     0.0009         0.0012         0.9979
11- Shoot to S2     0.0009         0.0003         0.9988
12- Shoot to S1     0.0002         0.0003         0.9995
13- Shoot to S2     0.0002         0.0001         0.9997
14- Shoot to S1     0.0001         0.0001         0.9999
15- Shoot to S2     0.0001         0.0000         0.9999
16- Shoot to S1     0.0000         0.0000         1.0000
17- Shoot to S2     0.0000         0.0000         1.0000
18- Shoot to S1     0.0000         0.0000         1.0000
19- Shoot to S2     0.0000         0.0000         1.0000
20- Shoot to S1     0.0000         0.0000         1.0000

For this action sequence, the Δ control parameter causes the simulation to declare mission complete
after iteration 5, shown in table 7, with a State_Dead BV component = 0.9231. This results in a
five-action sequence of R-S-R-S-S to declare. If we evaluate this sequence with the RS at a cost of 10
combat power points to recon and 100 combat power points to shoot, then the cost of the action
sequence is (2 x 10) + (3 x 100) = 320.
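
The action sequences of sections 3.2.2 and 3.2.3 can be reproduced with a short Python sketch of
the BV update and the CP decision rule. This is a sketch under stated assumptions, not the C3TRACE
implementation: the function names are hypothetical, the enemy is taken to be static at S1, every
recon is assumed to return the observation that matches the ground truth (sighted at S1, not
sighted at S2), and ties in the contrast ratio are resolved in favor of S1 rather than by a random
pick. The probabilities 0.75, 0.20, and 0.75 are the recon and strike values used in the report's
belief vector calculations.

```python
# Sketch of the belief-vector (BV) update and conditional-probability (CP) decision rule.
# Assumptions: static enemy at S1, deterministic observations, ties resolved to S1.
P_SIGHT_PRESENT = 0.75   # P(sighted | enemy in the reconned cell)
P_SIGHT_ABSENT = 0.20    # P(sighted | enemy elsewhere or dead), i.e., a false alarm
P_KILL = 0.75            # P(kill | enemy in the struck cell)
ENEMY_CELL = 0           # assumption: the enemy actually occupies S1 (index 0)


def recon_update(bv, cell):
    """Bayes update of [S1, S2, Dead] after a recon of `cell`."""
    sighted = (cell == ENEMY_CELL)   # assumption: observation matches ground truth
    joint = []
    for state, belief in enumerate(bv):
        p_sight = P_SIGHT_PRESENT if state == cell else P_SIGHT_ABSENT
        likelihood = p_sight if sighted else 1.0 - p_sight
        joint.append(likelihood * belief)
    total = sum(joint)               # normalizing factor p(o | b, a)
    return [j / total for j in joint]


def strike_update(bv, cell):
    """Move the killed fraction of the struck cell's belief into the Dead state."""
    new_bv = list(bv)
    killed = P_KILL * bv[cell]
    new_bv[cell] -= killed
    new_bv[2] += killed
    return new_bv


def run(sigma, delta=0.90, max_actions=20):
    """Return (action string, final BV), stopping once the Dead belief exceeds delta."""
    bv, actions = [0.5, 0.5, 0.0], ""
    while bv[2] <= delta and len(actions) < max_actions:
        alive = bv[0] + bv[1]
        cp = [bv[0] / alive, bv[1] / alive]   # contrast between the location cells
        cell = cp.index(max(cp))              # largest contrast; ties go to S1
        if cp[cell] > sigma:
            bv, actions = strike_update(bv, cell), actions + "S"
        else:
            bv, actions = recon_update(bv, cell), actions + "R"
    return actions, bv


for sigma in (0.89, 0.55):
    seq, final_bv = run(sigma)
    print(sigma, "-".join(seq), [round(x, 4) for x in final_bv])
```

Under these assumptions the two runs should yield the nine-action and five-action sequences
discussed above, with final Dead beliefs that agree with tables 6 and 7 to about four decimal
places.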

3.3  Reward Structure

Evaluation of the RS as σ is varied from 0.00 to 1.00 provides an indication of the “cost of doing
business” based on how aggressive the decision maker is in making action choices. Figure 10
shows a profile of the RS over this range with σ in increments of 0.10. Table 9 shows an expanded
view of the information in the X axis.

The intent is to minimize the cost of doing business by performing the least costly sequence of
actions to achieve the desired belief that the enemy has been destroyed. In an analytical use of this
model, tailoring the reconnaissance and strike asset capabilities so that they support an action
sequence of recon-shoot-recon-shoot-recon to achieve the specified belief threshold would allow
the system to be tuned for optimal performance along this parameter. Here, the optimal
performance occurs over the range of σ = 0.59 to 0.75 with a resulting action cost of 230.

Figure 10.  Action cost from reward structure varied by σ.


Table 8.  Sigma, σ, action sequence (R- recon, S- strike, RS- reward structure value).

Recon Cost = 10; Shoot Cost = 100

σ       Sequence          RS        σ       Sequence          RS
0.00    S1-S20            2000      0.74    RRSRS             230
0.01    SSSSSSS           700       0.75    RRSRS             230
0.05    SSSSSS            600       0.76    RRSRSRSRS         450
0.10    SSSSS             500       0.77    RRSRSRSRS         450
0.15    SSSSS             500       0.78    RRSRSRSRS         450
0.20    SSSSS             500       0.79    RSRRSRSRS         450
0.25    SSSS              400       0.80    RRSRSRSRS         450
0.30    SSSS              400       0.81    RRSRSRSRS         450
0.35    SSSS              400       0.82    RRSRSRSRS         450
0.40    SSSS              400       0.83    RRSRSRSRS         450
0.45    SSSS              400       0.84    RRSRSRSRS         450
0.50    SSSS              400       0.85    RRSRSRSRS         450
0.55    RSRSS             320       0.86    RRSRSRSRS         450
0.59    RSRSR             230       0.87    RRSRSRSRS         450
0.60    RSRSR             230       0.88    RRSRSRSRS         450
0.61    RSRSR             230       0.89    RRSRSRSRS         450
0.62    RSRSR             230       0.90    RRSRSRSRS         450
0.63    RSRSR             230       0.91    RRSRSRSRS         450
0.64    RSRSR             230       0.92    RRSRSRSRS         450
0.65    RSRSR             230       0.93    RRSRRSRSRS        460
0.66    RSRSR             230       0.94    RRRSRSRSRS        460
0.67    RSRSR             230       0.95    RRRSRSRSRS        460
0.68    RSRSR             230       0.96    RRRSRSRSRS        460
0.69    RSRSR             230       0.97    RRRSRSRSRS        460
0.70    RSRSR             230       0.98    RRRSRSRRSRSRS     580
0.71    RRSRS             230       0.99    RRRRSRSRS         360
0.72    RRSRS             230       1.00    20 R              200
0.73    RRSRS             230

Table 9.  Sigma, σ, action sequence (R- recon, S- strike, RS- reward structure value).

Recon Cost = 10; Shoot Cost = 100

σ       Sequence          RS        σ       Sequence          RS
0.00    S1-S20            2000      0.74    RRSRS             230
0.01    SSSSSSS           700       0.75    RRSRS             230
0.05    SSSSSS            600       0.76    RRSRSRSRS         450
0.10    SSSSS             500       0.77    RRSRSRSRS         450
0.15    SSSSS             500       0.78    RRSRSRSRS         450
0.20    SSSSS             500       0.79    RSRRSRSRS         450
0.25    SSSS              400       0.80    RRSRSRSRS         450
0.30    SSSS              400       0.81    RRSRSRSRS         450
0.35    SSSS              400       0.82    RRSRSRSRS         450
0.40    SSSS              400       0.83    RRSRSRSRS         450
0.45    SSSS              400       0.84    RRSRSRSRS         450
0.50    SSSS              400       0.85    RRSRSRSRS         450
0.55    RSRSS             320       0.86    RRSRSRSRS         450
0.59    RSRSR             230       0.87    RRSRSRSRS         450
0.60    RSRSR             230       0.88    RRSRSRSRS         450
0.61    RSRSR             230       0.89    RRSRSRSRS         450
0.62    RSRSR             230       0.90    RRSRSRSRS         450
0.63    RSRSR             230       0.91    RRSRSRSRS         450
0.64    RSRSR             230       0.92    RRSRSRSRS         450
0.65    RSRSR             230       0.93    RRSRRSRSRS        460
0.66    RSRSR             230       0.94    RRRSRSRSRS        460
0.67    RSRSR             230       0.95    RRRSRSRSRS        460
0.68    RSRSR             230       0.96    RRRSRSRSRS        460
0.69    RSRSR             230       0.97    RRRSRSRSRS        460
0.70    RSRSR             230       0.98    RRRSRSRRSRSRS     580
0.71    RRSRS             230       0.99    RRRRSRSRS         360
0.72    RRSRS             230       1.00    20 R              200
0.73    RRSRS             230

4.  Conclusions and Future Work


The current work has established a model that supports a computer simulation capable of
determining and optimizing decisions during conditions of uncertainty according to evaluations
of the BV about the current state, action decisions based on CP logic, and optimal performance
determination through the evaluation of action cost from the RS. Thus, the C3TRACE simulation
employing this model can make action decisions based on conditional probability evaluations of the
belief state representing the current situation. These action decisions are oriented toward a
goal-directed optimal outcome, and the model subsequently recognizes when the belief has been
achieved and the outcome reached.

This report demonstrates the logic of this model through the evaluation of the simplest of state
spaces. This sample state space consists of a 2x1 location state matrix and a single status state of
the dead condition, for a total of three states. Future work will expand the location state
space matrix to 2x2 and 5x5 as shown in figures 1 and 2, along with more sophisticated enemy
actions for moving versus static operations and goal-directed movement activities.

Appendix A.  Belief Vector Calculations for a Selected Five-Action Sequence


Belief vector calculations for the first five actions are presented here.

Parameter Definitions

Tables 1 through 3 are repeated in figure A-1 with parameter identifications to clarify which
parameter is being used in which calculation. These parameter identifications are shown
with a number inside a circle. They identify the original constants and
the results of each calculation to eliminate confusion as to which parameter constant is being
applied where.


This is a matrix of the POMDP conditional probability conditions for each combination of location
state occupied, location state reconned, and location state struck.

General form of Bayes' rule:

$$P(B_r \mid A) = \frac{P(B_r \cap A)}{\sum_{i=1}^{k} P(B_i \cap A)} = \frac{P(B_r)\,P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)} \quad \text{for } r = 1, 2, \ldots, k$$

Bayesian updating rule for this application:

$$p(s' \mid b, o, a) = \frac{p(o \mid s', b, a)\,p(s' \mid b, a)}{p(o \mid b, a)}$$

where
    s'  =  true state (of the condition being present within the total of all states, S),
           represented as s' ∈ S
    b   =  prior belief
    o   =  observation
    a   =  action that was generated

"State Destroyed" is an absorbing state, meaning that once the enemy is there they cannot get out,
i.e., return to "alive".

Initial State Probabilities

    Enemy in State 1       =  0.50
    Enemy in State 2       =  0.50
    Enemy in State Dead    =  0.00

Table 1 - The set of actions and their observations for the Seek & Destroy task.

    Action    Observation          State                Probability
    Recon     Enemy Sighted        Enemy Present        0.75
    Recon     Enemy Not Sighted    Enemy Present        0.25
    Recon     Enemy Sighted        Enemy Not Present    0.20
    Recon     Enemy Not Sighted    Enemy Not Present    0.80

The 2 cases for Enemy Sighted vs. Enemy Not Sighted are:

                         Recon to State1      Recon to State2
    Recon Sighted        [.75, .2, .2]        [.2, .75, .2]
    Recon Not Sighted    [.25, .8, .8]        [.8, .25, .8]
    Arty- NoInfo         [1.0, 1.0, 1.0]      [1.0, 1.0, 1.0]

Table 2 - Probabilities for Observation from Artillery Strike

    Action    Observation    State        Probability
    Strike    NoInfo         All Cases    1.00

Table 3 - Probabilities for Killing Enemy from Artillery Strike

    Action    Result     State                Probability of Dead
    Strike    SUCCESS    Enemy Present        0.75
    Strike    FAILURE    Enemy Present        0.25
    Strike    SUCCESS    Enemy Not Present    0.00

Transitioning Probabilities:

    Transition Into Loc. State    =  0.00 for static enemy case
    Trans. Out Of Loc. State      =  Prev. Prob for Killing
    Trans. Out Of Dead State      =  0.00 as enemy cannot transition out of the dead state once there

Figure A-1.  Initial constants for belief vector calculations.


Recon1 to S1

Action: Recon1; observation: Enemy Sighted; action to state: 1.

Effect:         [ p(o|s',b,a) x p(s'|b,a) ] / p(o|b,a) = p(s'|b,o,a)
Description:    [ Table 1 lookup x previous prob. of enemy in state occupied ] / normalizing factor

The denominator is the normalizing factor:
    (Prob of enemy in state action to x Table 1 lookup for that state) +
    (Prob of enemy in state action not to x Table 1 lookup for that state) +
    (Prob of enemy in State Dead x Table 1 lookup for condition applicable to State Dead)
    (i.e., the Table 1 lookup related to State3 is the prob. of a false alarm)
    = 0.5 x 0.75 + 0.5 x 0.2 + 0.0 x 0.2 = 0.375 + 0.1 = 0.475

Sample calculations:

    State occupied    p(o|s',b,a)    p(s'|b,a)    p(o|b,a)    p(s'|b,o,a)
    1                 0.75           0.5000       0.4750      0.7895
    2                 0.20           0.5000       0.4750      0.2105
    3                 0.20           0.0000       0.4750      0.0000
    Check sum                                                 1.0000

Figure A-2.  Belief vector calculations for first action: recon1 to state1.


Strike1 to S1

Action: Strike1; result: Kill Success; action to state: 1.

Effect:         [ p(o|s',b,a) x p(s'|b,a) ] / p(o|b,a) = p(s'|b,o,a)
Description:    Table 2 lookup x { [p(In-State) x (prev. prob. for state struck)] +
                [(1.0 - p(Out-State)) x (prev. prob. for state occupied)] } / Table 2 lookup

Sample calculations:

    State occupied    p(s'|b,a) term                          p(o|s',b,a)    p(s'|b,a)    p(o|b,a)    p(s'|b,o,a)
    1                 (0 x .7895) + ((1 - .75) x .7895)       1.00           0.1974       1.00        0.1974
    2                 (0 x .7895) + ((1 - 0) x .2105)         1.00           0.2105       1.00        0.2105
    3                 (.75 x .7895) + ((1 - 0) x .0000)       1.00           0.5921       1.00        0.5921
    Check sum                                                                                         1.0000

Figure A-3.  Belief vector calculations for second action: strike1 to state1.


Recon2 to S2

Action: Recon2; observation: Enemy Not Sighted; action to state: 2.

Effect:         [ p(o|s',b,a) x p(s'|b,a) ] / p(o|b,a) = p(s'|b,o,a)
Description:    [ Table 1 lookup x previous prob. of enemy in state occupied ] / normalizing factor

The denominator is the normalizing factor:
    (Prev prob of enemy in state action to x Table 1 lookup for that state) +
    (Prev prob of enemy in state action not to x Table 1 lookup for that state) +
    (Prev prob of enemy in State Dead x Table 1 lookup for condition applicable to State Dead)
    (i.e., the Table 1 lookup related to State3 is the prob. of a false alarm)
    = 0.2105 x 0.25 + 0.1974 x 0.8 + 0.5921 x 0.8 = 0.6842

Sample calculations:

    State occupied    p(o|s',b,a)    p(s'|b,a)    p(o|b,a)    p(s'|b,o,a)
    1                 0.80           0.1974       0.6842      0.2308
    2                 0.25           0.2105       0.6842      0.0769
    3                 0.80           0.5921       0.6842      0.6923
    Check sum                                                 1.0000

Figure A-4.  Belief vector calculations for third action: recon2 to state2.


Strike2 to S1

Action: Strike2; result: Kill Success; action to state: 1.

Effect:         [ p(o|s',b,a) x p(s'|b,a) ] / p(o|b,a) = p(s'|b,o,a)
Description:    Table 2 lookup x { [p(In-State) x (prev. prob. for state struck)] +
                [(1.0 - p(Out-State)) x (prev. prob. for state occupied)] } / Table 2 lookup

Sample calculations:

    State occupied    p(s'|b,a) term                          p(o|s',b,a)    p(s'|b,a)    p(o|b,a)    p(s'|b,o,a)
    1                 (0 x .2308) + ((1 - .75) x .2308)       1.00           0.0577       1.00        0.0577
    2                 (0 x .2308) + ((1 - 0) x .0769)         1.00           0.0769       1.00        0.0769
    3                 (.75 x .2308) + ((1 - 0) x .6923)       1.00           0.8654       1.00        0.8654
    Check sum                                                                                         1.0000

Figure A-5.  Belief vector calculations for fourth action: strike2 to state1.


Recon3 to S2

Action: Recon3; observation: Enemy Not Sighted; action to state: 2.

Effect:         [ p(o|s',b,a) x p(s'|b,a) ] / p(o|b,a) = p(s'|b,o,a)
Description:    [ Table 1 lookup x previous prob. of enemy in state occupied ] / normalizing factor

The denominator is the normalizing factor:
    (Prev prob of enemy in state action to x Table 1 lookup for that state) +
    (Prev prob of enemy in state action not to x Table 1 lookup for that state) +
    (Prev prob of enemy in State Dead x Table 1 lookup for condition applicable to State Dead)
    (i.e., the Table 1 lookup related to State3 is the prob. of a false alarm)
    = 0.0769 x 0.25 + 0.0577 x 0.8 + 0.8654 x 0.8 = 0.7577

Sample calculations:

    State occupied    p(o|s',b,a)    p(s'|b,a)    p(o|b,a)    p(s'|b,o,a)
    1                 0.80           0.0577       0.7577      0.0609
    2                 0.25           0.0769       0.7577      0.0254
    3                 0.80           0.8654       0.7577      0.9137
    Check sum                                                 1.0000

Figure A-6.  Belief vector calculations for fifth action: recon3 to state2.
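
The five belief vector updates of this appendix can be cross-checked with exact rational
arithmetic. The following Python sketch is illustrative only: the function names and the
(action, cell, observation) encoding are assumptions, and the probabilities are the table 1 and
table 3 constants written as fractions (0.75 = 3/4, 0.20 = 1/5, 0.25 = 1/4, 0.80 = 4/5).

```python
from fractions import Fraction as F

# Table 1 lookups as (enemy present, enemy not present) for each observation.
SIGHT = {True: (F(3, 4), F(1, 5)), False: (F(1, 4), F(4, 5))}
KILL = F(3, 4)   # table 3: probability of a kill when the enemy is present


def recon(bv, cell, sighted):
    """Bayes update of [S1, S2, Dead] for a recon of `cell` with the given observation."""
    present, absent = SIGHT[sighted]
    joint = [(present if s == cell else absent) * b for s, b in enumerate(bv)]
    total = sum(joint)                      # normalizing factor
    return [j / total for j in joint]


def strike(bv, cell):
    """Transfer the killed share of the struck cell's belief to the Dead state."""
    out = list(bv)
    out[cell] = (1 - KILL) * bv[cell]
    out[2] = bv[2] + KILL * bv[cell]
    return out


# The five actions of this appendix; cell 0 is S1 and cell 1 is S2.
steps = [("recon", 0, True), ("strike", 0, None), ("recon", 1, False),
         ("strike", 0, None), ("recon", 1, False)]

bv = [F(1, 2), F(1, 2), F(0)]
history = []
for kind, cell, sighted in steps:
    bv = recon(bv, cell, sighted) if kind == "recon" else strike(bv, cell)
    assert sum(bv) == 1                     # the check sum holds exactly at every step
    history.append(bv)
    print(kind, "S%d" % (cell + 1), [round(float(x), 4) for x in bv])
```

Working in fractions avoids accumulated rounding, so each printed vector can be compared directly
with the four-decimal values in figures A-2 through A-6.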



Appendix B.  Conditional Probability Calculations for First Five Action Sequences


The following five tables illustrate the conditional probability calculations for the sequence RECON1
to S1, STRIKE1 to S1, RECON2 to S2, STRIKE2 to S1, and RECON3 to S2.

Action 1: Make Choice to Perform Recon1 to S1.

The first action decision to be made by the model is whether to recon or shoot at S1 or S2. This is
based on the initial conditions of the state space, where the BV has been previously defined as [0.5,
0.5, 0.0]. It is also noted that the enemy is occupying State1. Using Equation 6, the numerical
evaluation of the CP, based on the limitations of the σ and Δ thresholds, is performed, resulting in a
first action to perform a reconnaissance mission to State1. The result of the action is to generate a
new BV = [0.7895, 0.2105, 0.0000].


Table B-1.  Conditional probability calculations for first action: Recon1 to S1.

Initial belief vector: [0.5, 0.5, 0.0]

CP: [S1/(S1+S2), S2/(S1+S2)] = [0.5/(0.5+0.5), 0.5/(0.5+0.5)] = [0.5, 0.5]. For an action where
Dead = 0.0, the denominator must sum to 1.0, and the total contrast must sum to 1.0.

Exceed Δ threshold? NO, because the DEAD state with a BV value = 0.0 is less than Δ, which is
equal to 0.9.

Exceed σ threshold? NO, because neither location cell has a value > 0.75, as each cell has a
contrast ratio value = 0.5.

Choice: As both S1 and S2 meet recon criteria with value = 0.5, select RAN(i). Assuming random
pick = 1, the choice is RAND recon (S_i), i.e., perform a recon to S1, or perform RECON1.

Observation: ENEMY SIGHTED from this action of RECON1 to S1 when enemy @ S1, generating a belief
vector = [0.7895, 0.2105, 0.0].

Action 2: Make Choice to Perform Strike1 to S1.

Using the BV from Action #1 of [0.7895, 0.2105, 0.0000], the CP is now evaluated to select
Action #2, which is to perform an artillery strike to S1, generating a new BV = [0.1974, 0.2105,
0.5921].

Table B-2.  Conditional probability calculations for second action: Strike1 to S1.

Previous belief vector: [0.7895, 0.2105, 0.0]

CP: [S1/(S1+S2), S2/(S1+S2)] = [0.7895/(0.7895+0.2105), 0.2105/(0.7895+0.2105)] =
[0.7895, 0.2105]. For an action where Dead belief = 0.0, the denominator must sum to 1.0, and the
total contrast must sum to 1.0.

Exceed Δ threshold? NO, because the DEAD state with a belief vector value = 0.0 is less than Δ,
which is equal to 0.9.

Exceed σ threshold? YES, because cell S1 has a contrast ratio value > 0.75.

Choice: Because S1 and only S1 meets SHOOT criteria, select shoot S1, because S2 @ 0.2105 is less
than σ at 0.75. For future cases where there might be multiple location states exceeding σ, set up
the general selection of RAND Shoot [S_i]. In this case the choice is to SHOOT at S1.

Observation: NO INFO (from this action of STRIKE1 to S1), generating a belief vector =
[0.1974, 0.2105, 0.5921].

Action  3:  Make  Choice  to  Perform  Recon2  to  S2. 


Table B-3.  Conditional probability calculations for third action: Recon2 to S2.

Previous belief vector: [0.1974, 0.2105, 0.5921]

CP: [S1/(S1+S2), S2/(S1+S2)] = [0.1974/(0.1974+0.2105), 0.2105/(0.1974+0.2105)] =
[0.4839, 0.5161]. For an action where Dead belief ≠ 0.0, the denominator will not sum to 1.0, but
the total contrast must still sum to 1.0, or 0.4839 + 0.5161 = 1.0.

Exceed Δ threshold? NO, because the DEAD state with a belief vector value = 0.5921 is less than Δ,
which is equal to 0.9.

Exceed σ threshold? NO, because neither cell has a CR value > 0.75, with S1 = 0.4839 and
S2 = 0.5161.

Choice: Because neither S1 nor S2 meets σ criteria, select RAN(i) of the set of cells with the
largest contrast value. In this case only S2 @ 0.5161 is in the set of cells containing the largest
contrast value; thus select a recon into S2, i.e., perform RECON2 into S2.

Observation: ENEMY NOT SIGHTED from this action of RECON2 to S2 when enemy @ S1, generating a
belief vector = [0.2308, 0.0769, 0.6923].

Action 4: Make Choice to Perform Strike2 to S1.


Using the BV from Action #3 of [0.2308, 0.0769, 0.6923], the CP is now evaluated to determine
Action #4, which is to perform an artillery strike to S1, generating a new BV = [0.0577, 0.0769,
0.8654].


Table B-4.  Conditional probability calculations for fourth action: Strike2 to S1.

Previous belief vector: [0.2308, 0.0769, 0.6923]

CP: [S1/(S1+S2), S2/(S1+S2)] = [0.2308/(0.2308+0.0769), 0.0769/(0.2308+0.0769)] =
[0.75008, 0.2499]. For an action where Dead belief ≠ 0.0, the denominator will not sum to 1.0, but
the total contrast must still sum to 1.0, or 0.750081 + 0.249919 = 1.0.

Exceed Δ threshold? NO, because the DEAD state with a belief vector value = 0.6923 is less than Δ,
which is equal to 0.9.

Exceed σ threshold? YES, because cell S1 has a contrast ratio value = 0.750081, which is > σ at
0.75.

Choice: Because S1 and only S1 meets SHOOT criteria, select shoot S1, because S2 @ 0.0769 is less
than σ at 0.75. For future cases where there might be multiple location states exceeding σ, set up
the general selection of RAND Shoot (S_i). Thus select SHOOT at S1, i.e., perform STRIKE2 at S1.

Observation: NO INFO (from this action of STRIKE2 to S1), generating a belief vector =
[0.0577, 0.0769, 0.8654].
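
The contrast ratio used in the σ test can be sketched in a few lines of Python. The function name
`contrast` is hypothetical. The first call uses the four-decimal belief values printed in table
B-4; the second uses the fractions 3/13 and 1/13, which are what the Appendix A update rules appear
to give for S1 and S2 after Action 3 if no intermediate rounding is applied (an assumption about how
the values were carried, not something the report states).

```python
from fractions import Fraction


def contrast(s1, s2):
    """Return the contrast ratios [S1/(S1+S2), S2/(S1+S2)] for the two location cells."""
    total = s1 + s2
    return [s1 / total, s2 / total]


# Four-decimal belief values as printed in table B-4.
rounded = contrast(0.2308, 0.0769)

# Unrounded beliefs after Action 3 (assumption: exact arithmetic throughout).
exact = contrast(Fraction(3, 13), Fraction(1, 13))

print(rounded)   # first entry lies slightly above 0.75
print(exact)     # first entry is exactly 3/4
```

Because the rounded ratio sits only about 0.00008 above σ = 0.75 while the unrounded ratio equals
0.75, the outcome of a strict "greater than σ" test at this step may depend on the numerical
precision carried through the belief vector.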

Action 5: Make Choice to Perform Recon3 to S2.

Using the BV from Action #4 of [0.0577, 0.0769, 0.8654], the CP is now evaluated to determine
Action #5, which is to perform a reconnaissance mission to S2, generating a new BV = [0.0609, 0.0254,
0.9137]. As the new BV component for State3 at 0.9137 now exceeds the Δ threshold of 0.90, the
model makes the decision to terminate with a declaration of ‘Mission Complete’ with the belief
that the enemy has been destroyed.

Table B-5.  Conditional probability calculations for fifth action: Recon3 to S2.

Previous belief vector: [0.0577, 0.0769, 0.8654]

CP: [S1/(S1+S2), S2/(S1+S2)] = [0.0577/(0.0577+0.0769), 0.0769/(0.0577+0.0769)] =
[0.4287, 0.5713]. For an action where Dead belief ≠ 0.0, the denominator will not sum to 1.0, but
the total contrast must still sum to 1.0, or 0.4287 + 0.5713 = 1.0.

Exceed Δ threshold? NO, because the DEAD state with a belief vector value = 0.8654 is less than Δ,
which is equal to 0.9.

Exceed σ threshold? NO, because neither cell has a CR value > 0.75, with S1 = 0.4287 and
S2 = 0.5713.

Choice: Because neither S1 nor S2 meets σ criteria, select RAN(i) of the set of cells with the
largest contrast value. In this case only S2 @ 0.5713 is in the set of cells containing the largest
contrast value; thus select a recon into S2, i.e., perform RECON3 into S2.

Observation: ENEMY NOT SIGHTED from this action of RECON3 to S2 when enemy @ S1, generating a
belief vector = [0.0609, 0.0254, 0.9137].

Note: Δ threshold now exceeded with DEAD state belief vector value = 0.9137, which is greater than
Δ at 0.9; therefore the next action will be to DECLARE.
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NO.  OF 

COPIES  ORGANIZATION 


NO.  OF 

COPIES  ORGANIZATION 


1  DEFENSE  TECHNICAL 
(PDF  INFORMATION  CTR 
ONLY)  DTICOCA 

8725  JOHN  J  KINGMAN  RD 
STE  0944 

FORT  BELVOIR  VA  22060-6218 

1  US  ARMY  RSRCH  DEV  &  ENGRG  CMD 
SYSTEMS  OF  SYSTEMS 
INTEGRATION 
AMSRD  SS  T 
6000  6TH  ST  STE  100 
FORT  BELVOIR  VA  22060-5608 

1  DIRECTOR 

US  ARMY  RESEARCH  LAB 
IMNE  ALC  IMS 
2800  POWDER  MILL  RD 
ADELPHI  MD  20783-1197 

1  DIRECTOR 

US  ARMY  RESEARCH  LAB 
AMSRD  ARL  Cl  OK  TL 
2800  POWDER  MILL  RD 
ADELPHI  MD  20783-1197 

2  DIRECTOR 

US  ARMY  RESEARCH  LAB 
AMSRD  ARL  CS  OK  T 
2800  POWDER  MILL  RD 
ADELPHI  MD  20783-1197 

1  ARMY  RSCH  LABORATORY  -  HRED 
ATTN  AMSRD  ARL  HR  M  DR  M  STRUB 
6359  WALKER  LANE  SUITE  100 
ALEXANDRIA  VA  22310 

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR ML J MARTIN
MYER CENTER RM 2D311
FT MONMOUTH NJ 07703-5601

1  ARMY  RSCH  LABORATORY  -  HRED 
ATTN  AMSRD  ARL  HR  MZ  A  DAVISON 
199  E  4TH  ST  STE  C  TECH  PARK  BLDG  2 
FT  LEONARD  WOOD  MO  65473-1949 

1  ARMY  RSCH  LABORATORY  -  HRED 
ATTN  AMSRD  ARL  HR  MD  T  COOK 
BLDG  5400  RM  C242 
REDSTONE  ARSENAL  AL  35898-7290 


1  COMMANDANT USAADASCH
ATTN AMSRD ARL HR ME A MARES
5800 CARTER RD
FT BLISS TX 79916-3802

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MO J MINNINGER
BLDG 5400 RM C242
REDSTONE ARSENAL AL 35898-7290

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MM DR V RICE-BERG
BLDG 4011 RM 217
1750 GREELEY RD
FT SAM HOUSTON TX 78234-5094

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MG R SPINE
BUILDING 333
PICATINNY ARSENAL NJ 07806-5000

1  ARL  HRED  ARMC  FLD  ELMT 

ATTN  AMSRD  ARL  HR  MH  C  BURNS 
BLDG  1467B  ROOM  336 
THIRD  AVENUE 
FT  KNOX  KY  40121 

1  ARMY  RSCH  LABORATORY  -  HRED 
AVNC  FIELD  ELEMENT 
ATTN  AMSRD  ARL  HR  MJ  D  DURBIN 
BLDG  4506  (DCD)  RM  107 
FT  RUCKER  AL  36362-5000 

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MK MR J REINHART
10125 KINGMAN RD
FT BELVOIR VA 22060-5828

5  ARMY  RSCH  LABORATORY  -  HRED 
ATTN  AMSRD  ARL  HR  MV  HQ  USAOTC 
S  MIDDLEBROOKS 
91012  STATION  AVE  ROOM  348 
FT  HOOD  TX  76544-5073 

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MY M BARNES
2520 HEALY AVE STE 1172 BLDG 51005
FT HUACHUCA AZ 85613-7069

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MP D UNGVARSKY
BATTLE CMD BATTLE LAB
415 SHERMAN AVE UNIT 3
FT LEAVENWORTH KS 66027-2326

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MJK J HANSBERGER
JFCOM JOINT EXPERIMENTATION J9
JOINT FUTURES LAB
115 LAKEVIEW PARKWAY SUITE B
SUFFOLK VA 23435

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MQ M R FLETCHER
US ARMY SBCCOM NATICK SOLDIER CTR
AMSRD NSC SSE BLDG 3 RM 341
NATICK MA 01760-5020

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MY DR J CHEN
12423 RESEARCH PARKWAY
ORLANDO FL 32826

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MS MR C MANASCO
SIGNAL TOWERS 118 MORAN HALL
FORT GORDON GA 30905-5233

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MU M SINGAPORE
6501 E 11 MILE RD MAIL STOP 284
BLDG 200A 2ND FL RM 2104
WARREN MI 48397-5000

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MF MR C HERNANDEZ
BLDG 3040 RM 220
FORT SILL OK 73503-5600

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MW E REDDEN
BLDG 4 ROOM 332
FT BENNING GA 31905-5400

1  ARMY RSCH LABORATORY - HRED
ATTN AMSRD ARL HR MN R SPENCER
DCSFDI HF
HQ USASOC BLDG E2929
FORT BRAGG NC 28310-5000


1  ARMY G1
ATTN DAPE MR B KNAPP
300 ARMY PENTAGON ROOM 2C489
WASHINGTON DC 20310-0300

ABERDEEN PROVING GROUND

1  DIRECTOR
US ARMY RSCH LABORATORY
ATTN AMSRD ARL CI OK (TECH LIB)
BLDG 4600

1  DIRECTOR
US ARMY RSCH LABORATORY
ATTN AMSRD ARL CI OK TP S FOPPIANO
BLDG 459

1  DIRECTOR
US ARMY RSCH LABORATORY
ATTN AMSRD ARL HR MR F PARAGALLO
BLDG 459